Personalizing PageRank Based on Domain Profiles
Abstract
Personalized versions of PageRank have been proposed to rank the results of a search engine based on a user’s topic or query of interest. This paper introduces a methodology for personalizing PageRank vectors based on URL features such as Internet domains. Users specify interest profiles as binary feature vectors in which each feature corresponds to a DNS tree node. Given a profile vector, a weighted PageRank can be computed by assigning a weight to each URL based on the match between the URL and the profile features. We present promising preliminary results from a small experiment in which users were allowed to select among nine URL features combining the top two levels of the DNS tree, leading to 2^9 pre-computed PageRank vectors from a Yahoo crawl. Personalized PageRank performed favorably compared to pure similarity-based ranking and traditional PageRank.
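The weighting idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`domain_features`, `profile_weight`, `personalized_pagerank`), the 0/1 match weight, the use of the weights as the teleportation distribution, the damping value, and the three-page toy graph are all assumptions made for illustration only.

```python
from urllib.parse import urlparse

def domain_features(url):
    """Return the top two DNS-tree nodes of a URL's host, e.g. {'edu', 'a.edu'}."""
    host = urlparse(url).hostname or ""
    labels = host.lower().split(".")
    feats = set()
    if labels and labels[-1]:
        feats.add(labels[-1])                # top-level domain
    if len(labels) >= 2:
        feats.add(".".join(labels[-2:]))     # second-level domain
    return feats

def profile_weight(url, profile):
    """Assumed weighting: 1.0 if the URL matches any selected feature, else 0.0."""
    return 1.0 if domain_features(url) & profile else 0.0

def personalized_pagerank(links, profile, damping=0.85, iters=50):
    """Power iteration whose teleport vector is biased toward profile matches.

    links maps every URL in the graph to the list of URLs it links to.
    """
    urls = sorted(links)
    weights = {u: profile_weight(u, profile) for u in urls}
    total = sum(weights.values())
    if total == 0:                           # no match: fall back to uniform
        teleport = {u: 1.0 / len(urls) for u in urls}
    else:
        teleport = {u: w / total for u, w in weights.items()}

    rank = dict(teleport)
    for _ in range(iters):
        new = {u: (1.0 - damping) * teleport[u] for u in urls}
        for u in urls:
            outs = [v for v in links[u] if v in links]
            if outs:
                share = damping * rank[u] / len(outs)
                for v in outs:
                    new[v] += share
            else:                            # dangling page: redistribute by teleport
                for v in urls:
                    new[v] += damping * rank[u] * teleport[v]
        rank = new
    return rank

# Hypothetical toy graph; the URLs are placeholders, not data from the paper.
toy_links = {
    "http://a.edu/": ["http://b.com/"],
    "http://b.com/": ["http://a.edu/", "http://c.org/"],
    "http://c.org/": ["http://b.com/"],
}
rank_edu = personalized_pagerank(toy_links, {"edu"})
rank_org = personalized_pagerank(toy_links, {"org"})
```

In this toy graph the `.edu` and `.org` pages are structurally symmetric, so any difference between their scores comes only from the profile. With nine binary features, one such vector per profile would give the 2^9 pre-computed vectors mentioned above.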
Related Resources
SpamRank -- Fully Automatic Link Spam Detection
Spammers intend to increase the PageRank of certain spam pages by creating a large number of links pointing to them. We propose a novel method based on the concept of personalized PageRank that detects pages with an undeserved high PageRank value without the need of any kind of white or blacklists or other means of human intervention. We assume that spammed pages have a biased distribution of p...
An Analytical Comparison of Approaches to Personalizing PageRank
PageRank, the popular link-analysis algorithm for ranking web pages, assigns a query and user independent estimate of “importance” to web pages. Query and user sensitive extensions of PageRank, which use a basis set of biased PageRank vectors, have been proposed in order to personalize the ranking function in a tractable way. We analytically compare three recent approaches to personalizing Page...
Using Hyperlink Features to Personalize Web Search
Personalized search has gained great popularity to improve search effectiveness in recent years. The objective of personalized search is to provide users with information tailored to their individual contexts. We propose to personalize Web search based on features extracted from hyperlinks, such as anchor terms or URL tokens. Our methodology personalizes PageRank vectors by weighting links base...
Personalizing PageRank-Based Ranking over Distributed Collections
In distributed work environments, where users are sharing and searching resources, ensuring an appropriate ranking at remote peers is a key problem. While this issue has been investigated for federated libraries, where the exchange of collection specific information suffices to enable homogeneous TFxIDF rankings across the participating collections, no solutions are known for PageRank-based ran...
Personalizing PageRank for Word Sense Disambiguation
In this paper we propose a new graph-based method that uses the knowledge in a LKB (based on WordNet) in order to perform unsupervised Word Sense Disambiguation. Our algorithm uses the full graph of the LKB efficiently, performing better than previous approaches in English all-words datasets. We also show that the algorithm can be easily ported to other languages with good results, with the onl...